 empirical covariance matrix




A EM-algorithm to fit LDFA-H (Section 2)

Neural Information Processing Systems

Since the MPLE objective function for LDFA-H given in Eq. (9) is not guaranteed to be convex, the EM-algorithm may find a local minimum depending on the choice of initial value; hence a good initialization is crucial for successful estimation. The initialization builds on the equivalence between CCA and probabilistic CCA shown by A. Anonymous, and the resulting Lasso problem is solved by the P-GLASSO algorithm of Mazumder et al. (2010). We simulated realistic data with known cross-region connectivity as follows. Notice that the amplitudes of the top four factors dominate the others.
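As a rough, hedged sketch of that initialization flow (not the authors' code): classical CCA can supply initial loadings via its equivalence with probabilistic CCA, and an off-the-shelf graphical lasso can stand in for the P-GLASSO solver to produce a sparse precision estimate. All array names, sizes, and the penalty below are illustrative.

import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))   # signals from region 1 (illustrative)
Y = rng.standard_normal((200, 8))    # signals from region 2 (illustrative)

# Step 1: classical CCA supplies candidate initial loadings, using its
# equivalence with probabilistic CCA.
cca = CCA(n_components=3).fit(X, Y)
Wx, Wy = cca.x_weights_, cca.y_weights_   # initial factor loadings

# Step 2: a sparse precision (inverse covariance) estimate of the joint data,
# with scikit-learn's GraphicalLasso standing in for the P-GLASSO solver.
Z = np.hstack([X, Y])
prec = GraphicalLasso(alpha=0.1).fit(Z).precision_
print(Wx.shape, prec.shape)   # (10, 3) (18, 18)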


Bayesian neural networks with interpretable priors from Mercer kernels

Alberts, Alex, Bilionis, Ilias

arXiv.org Machine Learning

Quantifying the uncertainty in the output of a neural network is essential for deployment in scientific or engineering applications where decisions must be made under limited or noisy data. Bayesian neural networks (BNNs) provide a framework for this purpose by constructing a Bayesian posterior distribution over the network parameters. However, the prior, which is of key importance in any Bayesian setting, is rarely meaningful for BNNs. This is because the complexity of the input-to-output map of a BNN makes it difficult to understand how certain distributions enforce any interpretable constraint on the output space. Gaussian processes (GPs), on the other hand, are often preferred in uncertainty quantification tasks due to their interpretability. The drawback is that GPs are limited to small datasets without advanced techniques, which often rely on the covariance kernel having a specific structure. To address these challenges, we introduce a new class of priors for BNNs, called Mercer priors, such that the resulting BNN has samples that approximate those of a specified GP. The method works by defining a prior directly over the network parameters from the Mercer representation of the covariance kernel, and does not rely on the network having a specific structure. In doing so, we can exploit the scalability of BNNs in a meaningful Bayesian way.
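A minimal numerical sketch of the Mercer (Karhunen-Loeve) representation that such priors build on, not the paper's prior construction: kernel eigenpairs are approximated on a grid (Nystrom) and a GP-like sample is drawn as a weighted sum of eigenfunctions with independent Gaussian coefficients. The kernel, grid, and truncation level below are arbitrary choices.

import numpy as np

def rbf_kernel(x, y, ell=0.2):
    # Squared-exponential kernel on 1-D inputs.
    return np.exp(-0.5 * (x[:, None] - y[None, :])**2 / ell**2)

# Quadrature grid on [0, 1] for a Nystrom approximation of the Mercer eigenpairs.
n = 200
x = np.linspace(0.0, 1.0, n)
w = 1.0 / n                                # uniform quadrature weight
K = rbf_kernel(x, x)

# Eigen-decomposition of the weighted Gram matrix approximates (lambda_i, phi_i).
evals, evecs = np.linalg.eigh(w * K)
evals, evecs = evals[::-1], evecs[:, ::-1]  # descending order
phi = evecs / np.sqrt(w)                    # eigenfunctions evaluated on the grid

# Truncated Karhunen-Loeve sample: f ~= sum_i sqrt(lambda_i) * xi_i * phi_i,
# with xi_i i.i.d. standard normal, a finite-dimensional stand-in for a GP draw.
m = 20
rng = np.random.default_rng(0)
xi = rng.standard_normal(m)
f = phi[:, :m] @ (np.sqrt(np.clip(evals[:m], 0.0, None)) * xi)
print(f.shape)   # (200,) sample path on the grid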


MMGP_supplementary_material

fabie

Neural Information Processing Systems

Details regarding the datasets are provided in Appendix A. Morphing strategies and dimensionality. Regarding the AirfRANS dataset, the reader is referred to [14]. Examples of input geometries are shown in Figure 6 together with the associated output pressure fields. The output scalars of the problem are obtained by post-processing the three-dimensional velocity. Examples of input geometries are shown in Figure 7. Figure 8: (Tensile2d) Illustration of Tutte's barycentric mapping used in the morphing stage. Notice that although these morphing techniques are called "mesh morphing" ... A zoom of the RBF morphing close to the airfoil for test sample 787 is illustrated in Figure 10.
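A minimal sketch of RBF-based mesh morphing of the kind illustrated in the figures: prescribed displacements at boundary control points are interpolated to all mesh nodes with scipy's RBFInterpolator. The point sets and displacements below are made up for illustration; the supplementary material's actual morphing pipeline (including the Tutte mapping) is more involved.

import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)

# Control points on the geometry boundary and their prescribed displacements
# (both illustrative; in practice these come from the target geometry).
ctrl_pts = rng.uniform(-1.0, 1.0, size=(30, 2))
ctrl_disp = 0.05 * rng.standard_normal((30, 2))

# Mesh nodes to be morphed.
nodes = rng.uniform(-1.0, 1.0, size=(500, 2))

# RBF interpolation of the displacement field, then apply it to the nodes.
rbf = RBFInterpolator(ctrl_pts, ctrl_disp, kernel="thin_plate_spline")
morphed_nodes = nodes + rbf(nodes)
print(morphed_nodes.shape)   # (500, 2)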



A Miscellaneous Results and Supporting

Neural Information Processing Systems

A.1 Properties of Stable Distributions. We will use the following property of stable distributions: Lemma A.1. By integrating the tail bound from the previous result, we get the following simple corollary: Corollary A.2. For fixed ...
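The step of turning a tail bound into a moment bound by integration follows a standard layer-cake pattern; as a hedged illustration only (the constants C, t_0 and the exponent alpha are placeholders, not the paper's statement), a tail bound P(|X| > t) <= C t^{-alpha} for t >= t_0 gives, for 0 < p < alpha,

\[
\mathbb{E}\,|X|^{p}
  = p \int_{0}^{\infty} t^{p-1}\,\mathbb{P}\bigl(|X| > t\bigr)\,dt
  \le t_{0}^{p} + p\,C \int_{t_{0}}^{\infty} t^{p-1-\alpha}\,dt
  = t_{0}^{p} + \frac{p\,C\,t_{0}^{\,p-\alpha}}{\alpha - p}.
\]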


CoVariance Filters and Neural Networks over Hilbert Spaces

Battiloro, Claudio, Cavallo, Andrea, Isufi, Elvin

arXiv.org Artificial Intelligence

CoVariance Neural Networks (VNNs) perform graph convolutions on the empirical covariance matrix of signals defined over finite-dimensional Hilbert spaces, motivated by robustness and transferability properties. Yet, little is known about how these arguments extend to infinite-dimensional Hilbert spaces. In this work, we take a first step by introducing a novel convolutional learning framework for signals defined over infinite-dimensional Hilbert spaces, centered on the (empirical) covariance operator. We constructively define Hilbert coVariance Filters (HVFs) and design Hilbert coVariance Networks (HVNs) as stacks of HVF filterbanks with nonlinear activations. We propose a principled discretization procedure, and we prove that empirical HVFs can recover the Functional PCA (FPCA) of the filtered signals. We then describe the versatility of our framework with examples ranging from multivariate real-valued functions to reproducing kernel Hilbert spaces. Finally, we validate HVNs on both synthetic and real-world time-series classification tasks, showing robust performance compared to MLP and FPCA-based classifiers.
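A finite-dimensional sketch of the covariance filtering that such filters reduce to after discretization: a polynomial in the empirical covariance matrix applied to each signal, followed by a pointwise nonlinearity, as in a single VNN/HVN-style layer. The filter taps and data below are arbitrary; this is not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 16))            # 300 signals over 16 dimensions
C = np.cov(X, rowvar=False)                   # empirical covariance matrix

def covariance_filter(x, C, taps):
    # h(C) x = sum_k taps[k] * C^k x, via repeated matrix-vector products.
    out = np.zeros_like(x)
    power = x.copy()
    for h_k in taps:
        out += h_k * power
        power = C @ power
    return out

taps = [0.5, 0.3, 0.2]                        # illustrative filter coefficients
Y = np.array([covariance_filter(x, C, taps) for x in X])
Z = np.tanh(Y)                                # nonlinearity: one "HVN-style" layer
print(Z.shape)                                # (300, 16)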



The noise level in linear regression with dependent data

Ziemann, Ingvar, Tu, Stephen, Pappas, George J., Matni, Nikolai

arXiv.org Machine Learning

Ordinary least squares (OLS) regression from a finite sample is one of the most ubiquitous and widely used techniques in machine learning. When faced with independent data, there are now sharp tools available to analyze its success optimally under relatively general assumptions. Indeed, a non-asymptotic theory matching the classical asymptotically optimal understanding from statistics [van der Vaart, 2000] has been developed over the last decade [Hsu et al., 2012, Oliveira, 2016, Mourtada, 2022]. However, once we relax the independence assumption and move toward data that exhibits correlations, the situation is much less well-understood--even for a problem as seemingly simple as linear regression. While sharp asymptotics are available through various limit theorems, there are no general results matching these in the finite sample regime. In this paper, we study the instance-specific performance of ordinary least squares in a setting with dependent data--and in contrast to much contemporary work on the theme--without imposing realizability.
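As a concrete, hedged illustration of the setting only (not the paper's analysis): OLS fit to covariates generated by an autoregressive process, so successive samples are dependent, with a response that is not exactly realizable by any linear predictor. All parameters below are made up.

import numpy as np

rng = np.random.default_rng(0)
T, d = 500, 3

# Dependent covariates: a stable AR(1) process, so the rows of X are correlated in time.
A = 0.8 * np.eye(d)
X = np.zeros((T, d))
for t in range(1, T):
    X[t] = A @ X[t - 1] + rng.standard_normal(d)

# Non-realizable response: a mildly nonlinear target plus noise, so no linear
# predictor fits it exactly (misspecified / agnostic setting).
beta_star = np.array([1.0, -0.5, 0.25])
y = X @ beta_star + 0.1 * np.sin(X[:, 0]) + 0.5 * rng.standard_normal(T)

# Ordinary least squares from the finite sample.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)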